Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mmap large dictionaries in patch-from mode #3486

Merged

Conversation

daniellerozenblit
Copy link
Contributor

@daniellerozenblit daniellerozenblit commented Feb 9, 2023

TLDR

This PR is a response to issue #3324, which requests a lower memory version of zstd --patch-from for large dictionaries.

This PR offers a partial solution by memory-mapping large dictionaries (i.e. dictionary files larger than the user set memory limit) rather than mallocing and copying them entirely into memory. This feature can also be triggered manually, using the flag --mmap-dict or disabled using the flag --no-mmap-dict.

This PR additionally offers some FIO speed improvements by avoiding a superfluous system call to check the dictionary file size for --patch-from as well as avoiding memcpy for all FIO dictionary loading. This is most significant for --patch-from, where dictionaries often can be upwards of 1GB in size, but should offer benefits for all FIO dictionary operations.

Benchmarking

I benchmarked on an Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz with core isolation and turbo disabled. For speed measurements, I ran each scenario five times interleaved and chose the highest result.

Memory-Mapped Dictionaries For Patch-From

Memory Usage

The memory saved by using a memory-mapped dictionary is highly dependent on the scenario. I did not see reduction in max RSS when creating the patch because my environment was not memory constrained and we access all parts of the dictionary during compression. During decompression, memory reduction is highly dependent on the patch size. We see greater savings in cases with larger diffs. Additionally, we see larger differences in max RSS when patching from a larger -> smaller file than when patching from a larger -> smaller file.

Screenshot 2023-03-08 at 6 16 46 PM

Compression Speed

There are some improvements in compression / patch-creation speed for lower compression levels when using mmap vs malloc for dictionary creation. These speed improvements taper out at higher compression levels.

Screenshot 2023-03-09 at 10 32 29 AM

Decompression Speed

There are some improvements in decompression speed across compression levels when using mmap vs malloc. I benchmarked--patch-from decompression with both --mmap-dict and --no-mmap-dict on the kernel source tree tarball (6.0 -> 6.2, which are ~1GB) and found that decompression speed improved by ~40% across a range of compression levels.

Screenshot 2023-03-09 at 11 31 03 AM

FIO Speed Improvements

We previously avoided memcpy-ing dictionaries for --patch-from compression. This PR additionally avoids calling memcpy on dictionary buffers for --patch-from decompression, as well as all non-patch-from compression and decompression calls.

I benchmarked standard --patch-from decompression (without mmap) on the kernel source tree tarball (6.0 -> 6.2, which are ~1GB) and found that decompression speed improved significantly (by ~50-60%) across a range of compression levels.
Screenshot 2023-03-09 at 12 08 26 PM

FIO_adjustParamsForPatchFromMode(prefs, &comprParams, dictSize, ssSize > 0 ? ssSize : maxSrcFileSize, cLevel);
}

mmapDict = 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just doing this for testing

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@daniellerozenblit daniellerozenblit changed the title Draft: Patch from low memory mode Mmap for large dictionaries Feb 13, 2023
@daniellerozenblit daniellerozenblit changed the title Mmap for large dictionaries Mmap for large dictionaries in patch-from mode Feb 13, 2023
@daniellerozenblit daniellerozenblit force-pushed the patch-from-low-memory-mode branch from bc0ead9 to 4373c5a Compare February 13, 2023 15:26
@daniellerozenblit daniellerozenblit changed the title Mmap for large dictionaries in patch-from mode Mmap large dictionaries in patch-from mode Feb 13, 2023
@@ -973,10 +1041,19 @@ static cRess_t FIO_createCResources(FIO_prefs_t* const prefs,
/* need to update memLimit before calling createDictBuffer
* because of memLimit check inside it */
if (prefs->patchFromMode) {
U64 const dictSize = UTIL_getFileSize(dictFileName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should reorganize this so that we aren't stat-ing a file multiple times for patch-from.

@daniellerozenblit daniellerozenblit force-pushed the patch-from-low-memory-mode branch from e42fcb6 to 2d8afd9 Compare February 14, 2023 17:42
@jluebbe
Copy link

jluebbe commented Feb 22, 2023

On 32 bit systems, mmap() of files approaching 1GB can be difficult. Would it be possible to (probably optionally) read() the necessary data from the --patch-from file as needed? The performance would obviously be lower, but it would scale to patching very large files (such as disk images).

@Cyan4973
Copy link
Contributor

On 32 bit systems, mmap() of files approaching 1GB can be difficult. Would it be possible to (probably optionally) read() the necessary data from the --patch-from file as needed? The performance would obviously be lower, but it would scale to patching very large files (such as disk images).

It's complex enough to deserve being a separate development effort.

@daniellerozenblit daniellerozenblit marked this pull request as ready for review February 27, 2023 19:45
programs/fileio.c Outdated Show resolved Hide resolved
programs/zstdcli.c Outdated Show resolved Hide resolved
programs/fileio.c Outdated Show resolved Hide resolved
@Cyan4973
Copy link
Contributor

Cyan4973 commented Mar 7, 2023

The code looks good, I only have some fairly minor comments.

The code should result in some memory savings when triggered by the proper scenario. For example, using --patch-from with a 1 GB reference file.
Could you document the memory benefits, and eventually the performance costs, from using this patch, in the description of the PR ?

} else if (dict->dictBufferType == FIO_mmapDict) {
FIO_munmap(dict->dictBuffer, dict->dictBufferSize);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: I would add an else assert(0); at the end,
because such a case is supposed to never happen,
unless the FIO_Dict_t code is changed in the future.

Copy link
Contributor

@Cyan4973 Cyan4973 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great !

Just a minor defensive-programming comment, and it's basically good to go !

@daniellerozenblit daniellerozenblit force-pushed the patch-from-low-memory-mode branch from 4561522 to 70850eb Compare March 9, 2023 00:55
@daniellerozenblit daniellerozenblit merged commit e0fc9fd into facebook:dev Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants